The following content has been provided by the University of Erlangen-Nürnberg.
OK, welcome everyone.
So last time we started with recurrent neural networks, and in the first few minutes I want to recap what this is about.
Here I show the standard situation that we want to solve.
Instead of being presented say with an image and having to assign a category to this image,
we are rather presented with a time series and we have to do something with that time
series.
Typically we want to generate another time series or maybe we want to classify that time
series in the end.
And the problem with all of this is that, in contrast to images, where you may get away
with convolutional neural networks that have only local information, here you sometimes
really need long-term memory.
And an example is shown here.
So that would be the time series of observations that reach your network and the network has
to react to these observations.
And it could happen that you need to recall something that you learned way in the past.
And the example I told you last time was: if you have a self-driving car and it sees a
sign that says "turn left in 100 meters", then 100 meters later, when it actually arrives at the
junction, it has to recall that it should turn left.
So the way it would look here is that most of the time there is actually no
important observation, at least for the part of the network that is trained to look for
such signs.
Then there is suddenly a signal, and afterwards there is again no important signal, until finally
the car arrives at the junction and has to recall what the instruction was.
And I made the point that this is relatively hard to implement.
So the solution to that is recurrent neural networks: recurrent because some of the information
that is available at one moment in time recurs at the next step, and the step after that.
So we have some sort of memory.
And a generic picture for such a recurrent neural network is shown here.
So in all of the following, time typically runs to the right, and the different layers
of the network run along the vertical axis, just like before.
So in a normal neural network what you would have is that at each time step you have an
input and then from that you calculate an output maybe going through several layers
of the network and that's it.
And then for the next time step you would start fresh, you get an input, you calculate
an output and again and again.
But obviously there would be no memory in such a structure.
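As a minimal sketch of such a memoryless setup (the names feedforward_step, W, and b are just illustrative assumptions, not notation from the lecture), each time step would be processed completely on its own:

```python
import numpy as np

def feedforward_step(x_t, W, b):
    """One time step of a plain feedforward network:
    the output depends only on the current input x_t."""
    return np.tanh(W @ x_t + b)

# each observation is mapped to an output in isolation,
# so nothing seen earlier can influence later time steps
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3)) * 0.1   # 3 inputs -> 4 outputs
b = np.zeros(4)
outputs = [feedforward_step(x_t, W, b) for x_t in rng.normal(size=(10, 3))]
```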
So what is done here instead is that, in addition to feeding the input, that is, the external
observation, into the network at each time step, you also feed the result of a previous
calculation of your network into the current time step.
So the network knows the output neuron values of some layer as they existed in a previous
time step and this is what these horizontal arrows should represent.
So then this neuron receives information both from the input and from the previous output,
and then it can calculate some nonlinear function of the combination
of these two inputs, namely the real external input and the memory.
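As a minimal sketch of what such a recurrent unit computes (the names rnn_step, W_xh, W_hh, and b are illustrative assumptions, not notation from the lecture), the new memory is a nonlinear function of the external input combined with the previous step's output:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b):
    """One recurrent step: combine the external input x_t with the
    memory h_prev carried over from the previous time step, then
    apply a nonlinearity."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b)
```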
And so then we can draw this as a function of time.
It's always the same network structure.
This is why this repeats at every time step.
The weights will also be the same at every time step.
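Because the same weights are reused at every time step, unrolling the network over a sequence is just a loop that applies one and the same step function again and again. Here is a sketch under the same assumed names as above (run_rnn and the toy dimensions are purely illustrative):

```python
import numpy as np

def run_rnn(xs, W_xh, W_hh, b):
    """Unroll the recurrence over a whole input sequence xs.
    The identical weight matrices are applied at every time step."""
    h = np.zeros(W_hh.shape[0])            # memory starts empty
    states = []
    for x_t in xs:                         # time runs over the sequence
        h = np.tanh(W_xh @ x_t + W_hh @ h + b)
        states.append(h)
    return states

# toy usage: 3-dimensional observations, 5 hidden units, 10 time steps
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(5, 3)) * 0.1
W_hh = rng.normal(size=(5, 5)) * 0.1
b = np.zeros(5)
states = run_rnn(rng.normal(size=(10, 3)), W_xh, W_hh, b)
```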